AI and narrative embeddings detect PTSD following childbirth via birth stories

Free-text analysis using machine learning (ML)-based natural language processing (NLP) shows promise for diagnosing psychiatric conditions. Chat Generative Pre-trained Transformer (ChatGPT) has demonstrated preliminary feasibility for this purpose; however, whether it can accurately assess mental illness remains to be determined. This study evaluates the effectiveness of ChatGPT and the text-embedding-ada-002 (ADA) model in detecting post-traumatic stress disorder following childbirth (CB-PTSD), a maternal postpartum mental illness that affects millions of women annually and has no standard screening protocol. Using a sample of 1295 women aged 18 years or older who had given birth in the previous six months, recruited through hospital announcements, social media, and professional organizations, we explore ChatGPT’s and ADA’s potential to screen for CB-PTSD by analyzing maternal childbirth narratives. The PTSD Checklist for DSM-5 (PCL-5; cutoff 31) was used to assess CB-PTSD. By developing an ML model that uses the numerical vector representations produced by the ADA model, we identify CB-PTSD via narrative classification. Our model (F1 score: 0.81) outperformed ChatGPT and six previously published large text-embedding models trained on mental health or clinical domain data, suggesting that the ADA model can be harnessed to identify CB-PTSD. Our modeling approach could be generalized to assess other mental health disorders.


Appendix A Steps to Build and Test Model #3 of This Study
The following four steps describe how we built and tested Model #3:
Step 1. Define a PCL-5 cutoff score. We labeled each narrative as Class 1: Probable CB-PTSD ('CB-PTSD') based on PCL-5 ≥ 31, or Class 0: No Probable CB-PTSD ('No CB-PTSD') based on PCL-5 < 31.
Step 2. Data preparation. We discarded narratives with < 30 words from the dataset [47,48]. To handle the class imbalance in the analyzed dataset (due to the small number of cases with PCL-5 ≥ 31), we randomly sampled the majority Class 0 down to the size of the minority Class 1. Using the balanced dataset, we randomly selected 70% of the narratives to train our model and 30% to test it. This step was repeated 10 times.
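Steps 1 and 2 can be illustrated with a minimal sketch. The cutoff, minimum word count, split ratio, and number of repetitions come from the text; the function names, the `(text, score)` record layout, whitespace-based word counting, and the plain (non-stratified) random split are assumptions made for illustration and are not confirmed by the study.

```python
import random

PCL5_CUTOFF = 31       # Step 1: PCL-5 >= 31 -> Class 1
MIN_WORDS = 30         # Step 2: narratives with < 30 words are discarded
TRAIN_FRACTION = 0.7   # Step 2: 70% train, 30% test
N_REPEATS = 10         # Step 2: the sampling and split are repeated 10 times


def label(pcl5_score):
    """Return 1 (Probable CB-PTSD) or 0 (No Probable CB-PTSD)."""
    return 1 if pcl5_score >= PCL5_CUTOFF else 0


def balanced_split(records, seed):
    """records: list of (narrative_text, pcl5_score) tuples (hypothetical layout).

    Returns (train, test), each a list of (narrative_text, class_label).
    """
    rng = random.Random(seed)
    # Assumption: words are counted by splitting on whitespace.
    kept = [(text, label(score)) for text, score in records
            if len(text.split()) >= MIN_WORDS]
    class1 = [r for r in kept if r[1] == 1]
    class0 = [r for r in kept if r[1] == 0]
    # Randomly sample the majority Class 0 down to the size of Class 1.
    class0 = rng.sample(class0, min(len(class0), len(class1)))
    balanced = class1 + class0
    rng.shuffle(balanced)
    # Assumption: a plain random 70/30 split without stratification.
    n_train = round(TRAIN_FRACTION * len(balanced))
    return balanced[:n_train], balanced[n_train:]


def repeated_splits(records):
    """One balanced train/test split per repetition, each with its own seed."""
    return [balanced_split(records, seed) for seed in range(N_REPEATS)]
```

Each repetition draws a fresh random subsample of Class 0, so the reported metrics can be averaged over the 10 resulting train/test splits.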
Step 3. Develop a Machine Learning (ML) classifier that utilizes Natural Language Processing (NLP) features. Using the Train set, we developed a model that analyzes pairwise narrative (sentence) data. The goal of the model is to learn to identify semantically or contextually similar pairs of sentences. First, each sentence (Si) is mapped onto a fixed-size embedding vector, denoted emb(Si), using the text-embedding-ada-002 model via the OpenAI API. To learn whether two sentences are semantically or contextually similar, we train a classifier to analyze their Hadamard product (denoted: •) [54] and decide whether they belong to the same class. Given a sentence Sa not present during training, a sentence S1 ∈ Class 1, and a sentence S0 ∈ Class 0, then Sa ∈ Class 1 if the probability of Class 1 affiliation when applying our model to the Hadamard product emb(Sa) • emb(S1) is higher than the probability of Class 0 affiliation when applying our model to the Hadamard product emb(Sa) • emb(S0).
This approach allowed us to generate multiple training examples, since there are n(n − 1)/2 possible pairs among n sentences, thus addressing the challenge of training an ML model with a small number of examples, as in Class 1. More specifically, the following three substeps describe the model development.
1. Each pair of sentences in Class 1, and each pair of sentences in Class 0, were labeled as positive examples, indicating semantically or contextually similar sentences of individuals with (Set #1) or without (Set #2) CB-PTSD, respectively. Next, negative examples (Set #3) of the same size as the positive example sets (||Set #1|| + ||Set #2||) were created by randomly selecting pairs of sentences, one from Class 1 and the other from Class 0, indicating semantically or contextually nonsimilar sentences.
2. Using the text-embedding-ada-002 model, each sentence was mapped into a dense vector space. Then, for each of Sets #1 to #3, we computed a vector z from the embeddings (emb) of each pair of sentences (u, v) selected in Substep 1: z = (emb(u) • emb(v)).
3. A densely connected feedforward neural network (DFNN) was trained to classify pairs of sentences (by processing vector z) as semantically similar or not.
Step 4. Test model performance. We compared the performance of Model #3 with the model in [47], as well as with Model #1 and Model #2. We report the area under the receiver operating characteristic curve (AUC), F1 score, Sensitivity, and Specificity measures on the Test set. To test Model #3 on a previously unseen narrative S in the Test set, we first compute its embedding. Next, we calculate the average embedding vector (vn) of all Train narratives in Class 0, and the average embedding vector (vp) of all Train narratives in Class 1. To decide the class of S, we compute zn = (emb(S) • vn) and zp = (emb(S) • vp). Then, we apply Model #3 (denoted as f(x)) to zn and zp and compare its outputs, i.e., we compare the likelihood of similarity of emb(S) to vp with the likelihood of similarity of emb(S) to vn. If f(zp) > f(zn), we say that S ∈ Class 1; otherwise, S ∈ Class 0. Intuitively, our model should assign a higher likelihood of similarity between an embedded narrative of a woman with CB-PTSD and the vector vp than between that narrative and the vector vn.
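The pair construction and the decision rule described in this appendix can be sketched with NumPy. This is an illustration under stated assumptions, not the study's implementation: embeddings are assumed to be already computed and stored as 2-D arrays (one row per narrative), cross-class pairs are drawn with replacement, and `build_pair_features` and `classify` are hypothetical names. The trained DFNN is represented by an arbitrary callable `f`, since its architecture is not specified here.

```python
from itertools import combinations

import numpy as np


def build_pair_features(emb_pos, emb_neg, seed=0):
    """Build Hadamard-product features z and similarity labels for training.

    emb_pos, emb_neg: arrays of shape (n_class1, d) and (n_class0, d).
    Returns X of shape (2 * n_similar, d) and y with 1 = similar, 0 = nonsimilar.
    """
    rng = np.random.default_rng(seed)
    # Sets #1 and #2: every within-class pair is a positive (similar) example,
    # giving n(n - 1)/2 pairs per class.
    similar = np.array([u * v
                        for group in (emb_pos, emb_neg)
                        for u, v in combinations(group, 2)])
    # Set #3: as many random cross-class pairs as there are positive examples.
    # Assumption: pairs are sampled with replacement.
    i = rng.integers(len(emb_pos), size=len(similar))
    j = rng.integers(len(emb_neg), size=len(similar))
    nonsimilar = emb_pos[i] * emb_neg[j]
    X = np.vstack([similar, nonsimilar])
    y = np.concatenate([np.ones(len(similar)), np.zeros(len(nonsimilar))])
    return X, y


def classify(emb_s, emb_pos_train, emb_neg_train, f):
    """Assign an unseen narrative to Class 1 if f(zp) > f(zn), else Class 0.

    f is any callable that maps a vector z to a similarity score; in the
    study this role is played by the trained DFNN.
    """
    vp = emb_pos_train.mean(axis=0)  # average Class 1 Train embedding
    vn = emb_neg_train.mean(axis=0)  # average Class 0 Train embedding
    zp = emb_s * vp                  # Hadamard product with the Class 1 centroid
    zn = emb_s * vn                  # Hadamard product with the Class 0 centroid
    return 1 if f(zp) > f(zn) else 0
```

Any binary classifier fitted on `(X, y)` that outputs a similarity probability could serve as `f`. For quick experimentation, a toy stand-in such as `f = lambda z: float(z.sum())` (the sum of a Hadamard product equals the dot product of the two vectors) exercises the decision rule without training a network.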

Figure A1. The modeling approach for classifying pairs of narratives associated with women with or without CB-PTSD.